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Safe Harbor Statement 

The following is intended to provide some insight into a line of research in Oracle Labs. It 
is intended for information purposes only, and may not be incorporated into any contract. 
It is not a commitment to deliver any material, code, or functionality, and should not be
relied upon in making purchasing decisions. Oracle reserves the right to alter its 
development plans and practices at any time, and the development, release, and timing 
of any features or functionality described in connection with any Oracle product or 
service remains at the sole discretion of Oracle. Any views expressed in this presentation 
are my own and do not necessarily reflect the views of Oracle. 
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Ruby 


Imperative 
'Scripting' (Perl) 
Object-oriented (Smalltalk) 
Batteries included 
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def delete_entry(key, options)
  if File.exist?(key)
    begin
      File.delete(key)
      delete_empty_directories(File.dirname(key))
      true
    rescue => e
      # Just in case the error was caused by
      # another process deleting the file first.
      raise e if File.exist?(key)
      false
    end
  end
end
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MRI 


Simple bytecode interpreter 

Implemented in C 

Core library implemented in C 
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JRuby

JITs by emitting JVM bytecode
VM in Java
Core library mostly in Java

The JRuby logo is copyright Tony Price 2011, licensed under the terms of Creative Commons Attribution-NoDerivs 3.0 Unported (CC BY-ND 3.0)
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The Rubinius logo is copyright 2011 Shane Becker, licensed under the terms of Creative Commons Attribution-NoDerivatives 4.0 International — CC BY-ND 4.0 

Rubinius 


JITs by emitting LLVM IR 
VM in C++ 

Core library mostly in Ruby 
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+ Truffle and Graal 
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Compatibility with the 
language (spec/ruby) 


Copyright © 2016, Oracle and/or its affiliates. All rights reserved. | 



90 %


ORACLE 


Compatibility with the 
core library (spec/ruby) 
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Why aren't you using more of JRuby? 
Such as the existing Java implementation 

of the core library? 
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What makes Ruby difficult to 
optimise?
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How do people want to 
write Ruby? 
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def clamp(num, min, max)
  [min, num, max].sort[1]
end
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def include?(value)
  if value.is_a?(::Range)
    # 1...10 includes 1..9 but it does not include 1..10.
    operator = exclude_end? && !value.exclude_end? ? :< : :<=
    super(value.first) && value.last.send(operator, last)
  else
    super
  end
end
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class Object
  # An object is blank if it's false, empty, or a whitespace string.
  # For example, '', '   ', +nil+, [], and {} are all blank.
  def blank?
    respond_to?(:empty?) ? !!empty? : !self
  end
end
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def hard_mix(fg, bg, opts={})
  return apply_opacity(fg, opts) if fully_transparent?(bg)
  return bg if fully_transparent?(fg)

  mix_alpha, dst_alpha = calculate_alphas(
    fg, bg, DEFAULT_OPTS.merge(opts))

  new_r = blend_channel(r(bg), (r(bg) + r(fg) <= 255) ? 0 : 255, mix_alpha)
  new_g = blend_channel(g(bg), (g(bg) + g(fg) <= 255) ? 0 : 255, mix_alpha)
  new_b = blend_channel(b(bg), (b(bg) + b(fg) <= 255) ? 0 : 255, mix_alpha)

  rgba(new_r, new_g, new_b, dst_alpha)
end

def method_missing(method, *args, &block)
  return ChunkyPNG::Color.send(method, *args) if ChunkyPNG::Color.respond_to?(method)
  normal(*args)
end
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class Duration
  attr_accessor :value

  def initialize(value)
    @value = value
  end

  def as_json
    ...
  end

  def inspect
    ...
  end

  def method_missing(method, *args, &block)
    value.send(method, *args, &block)
  end
end
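The proxy pattern in the Duration class can be tried out with a cut-down stand-in. This is only a sketch: `SecondsProxy` is a hypothetical name, not the real class from the slide, and it assumes the wrapped value is a plain Integer. It shows that a call such as `proxy + 30` passes through `method_missing` and then `send` before the real `Integer#+` runs.

```ruby
# Hypothetical stand-in for the Duration class: wraps a value and forwards
# every method it does not define itself to that value.
class SecondsProxy
  attr_accessor :value

  def initialize(value)
    @value = value
  end

  def inspect
    "#{value} seconds"
  end

  # Each forwarded call is two dynamic dispatches (method_missing, then
  # send) before the target method is reached.
  def method_missing(method, *args, &block)
    value.send(method, *args, &block)
  end

  def respond_to_missing?(method, include_private = false)
    value.respond_to?(method, include_private) || super
  end
end

proxy = SecondsProxy.new(90)
puts proxy + 30      # forwarded to Integer#+
puts proxy.even?     # forwarded to Integer#even?
puts proxy.inspect   # defined on the proxy itself
```

An implementation that cannot see through `method_missing` and `send` has to treat each of these calls as opaque.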
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def grayscale_entry(bit_depth)
  value = ChunkyPNG::Canvas.send(
    :"decode_png_resample_#{bit_depth}bit_value",
    content.unpack('n')[0])
  ChunkyPNG::Color.grayscale(value)
end
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def delegate(method)
  method_def = (
    "def #{method}(*args, &block)\n" +
    "  delegated.#{method}(*args, &block)\n" +
    "end"
  )
  module.eval(method_def, file, line)
end
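A self-contained version of this string-based metaprogramming might look like the sketch below. The names `StringDelegation`, `delegate_to` and `Playlist` are invented for illustration, and `class_eval` is used where the slide uses `eval` on a module. The point is that the forwarding methods only come into existence when generated source text is evaluated at run time.

```ruby
# Hypothetical helper in the style of the slide's `delegate`: builds the
# source of a forwarding method as a string and evaluates it.
module StringDelegation
  def delegate_to(target, *methods)
    methods.each do |method|
      method_def =
        "def #{method}(*args, &block)\n" +
        "  #{target}.#{method}(*args, &block)\n" +
        "end"
      # class_eval with a string parses and defines new code at run time.
      class_eval(method_def, __FILE__, __LINE__)
    end
  end
end

class Playlist
  extend StringDelegation

  def initialize(tracks)
    @tracks = tracks
  end

  delegate_to :@tracks, :size, :first
end

playlist = Playlist.new(["intro", "verse", "chorus"])
puts playlist.size
puts playlist.first
```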
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#
# Executes the generated ERB code to produce a completed template, returning
# the results of that code. (See ERB::new for details on how this process
# can be affected by _safe_level_.)
#
# _b_ accepts a Binding object which is used to set the context of
# code evaluation.
#
def result(b=new_toplevel)
  if @safe_level
    proc {
      $SAFE = @safe_level
      eval(@src, b, (@filename || '(erb)'), @lineno)
    }.call
  else
    eval(@src, b, (@filename || '(erb)'), @lineno)
  end
end
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Why can't a conventional VM
optimise this?

Why can't JRuby make this as fast
as we want?
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First problem: JRuby's core library is 

megamorphic 
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@JRubyMethod(name = "+")
public IRubyObject op_plus(ThreadContext context, IRubyObject other) {
    if (other instanceof RubyFixnum) {
        return addFixnum(context, (RubyFixnum) other);
    }
    if (other instanceof RubyBignum) {
        return ((RubyBignum) other).op_plus(context, this);
    }
    if (other instanceof RubyFloat) {
        return context.runtime.newFloat(
            (double) value + ((RubyFloat) other).getDoubleValue());
    }
    return coerceBin(context, "+", other);
}
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Second problem: JRuby's core library is

stateless 
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@JRubyMethod(name = "send")
public IRubyObject send19(ThreadContext context, IRubyObject arg0, Block block) {
    String name = RubySymbol.objectToSymbolString(arg0);
    DynamicMethod method = getMetaClass().searchMethod(name);
    if (getMetaClass().shouldCallMethodMissing(method)) {
        return Helpers.callMethodMissing(context, this,
            method.getVisibility(), name, CallType.FUNCTIONAL, block);
    }
    return method.call(context, this, getMetaClass(), name, block);
}
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Third problem: JRuby's core library is 

very deep 
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@JRubyMethod(name = "sort")
public IRubyObject sort(ThreadContext context, Block block) {
    modify();
    if (realLength > 1) {
        return sortInternal(context, block);
    }
    return this;
}
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private IRubyObject sortInternal(final ThreadContext context, final Block block) {
    IRubyObject[] newValues = new IRubyObject[realLength];
    int length = realLength;

    safeArrayCopy(values, begin, newValues, 0, length);
    Qsort.sort(newValues, 0, length, new Comparator() {
        public int compare(Object o1, Object o2) {
            IRubyObject obj1 = (IRubyObject) o1;
            IRubyObject obj2 = (IRubyObject) o2;
            IRubyObject ret = block.yieldArray(context, getRuntime().newArray(obj1, obj2), null);
            return RubyComparable.cmpint(context, ret, obj1, obj2);
        }
    });

    values = newValues;
    begin = 0;
    realLength = length;
    return this;
}
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private static void quicksort_loop(Object[] a, int lo, int hi, Comparator c) {
    final ArrayList<int[]> stack = new ArrayList<int[]>(16);

    int[] entry = new int[2];
    entry[0] = lo;
    entry[1] = hi;

    while (!stack.isEmpty() || entry != null) {

        if (entry == null) {
            entry = stack.remove(stack.size() - 1);
        }

        lo = entry[0];
        hi = entry[1];

        int midi = lo + (hi - lo) / 2;
        Object mid = a[midi];
        Object m1;
        Object m3;

        // do median of 7 if the array is over 200 elements.
        if ((hi - lo) >= 200) {
            int t = (hi - lo) / 8;
            m1 = med3(a[lo + t], a[lo + t * 2], a[lo + t * 3], c);
            m3 = med3(a[midi + t], a[midi + t * 2], a[midi + t * 3], c);
        } else {
            // if it's less than 200 do median of 3
            int t = (hi - lo) / 4;
            m1 = a[lo + t];
            m3 = a[midi + t];
        }

        mid = med3(m1, mid, m3, c);
        if (hi - lo >= 63) {
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Fourth problem: JRuby's core library 

isn't amenable to optimisations 
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private static void quicksort_loop(Object[] a, int lo, int hi, Comparator c) {
    final ArrayList<int[]> stack = new ArrayList<int[]>(16);

    int[] entry = new int[2];
    entry[0] = lo;
    entry[1] = hi;

    while (!stack.isEmpty() || entry != null) {

        if (entry == null) {
            entry = stack.remove(stack.size() - 1);
        }

        lo = entry[0];
        hi = entry[1];

        int midi = lo + (hi - lo) / 2;
        Object mid = a[midi];
        Object m1;
        Object m3;

        // do median of 7 if the array is over 200 elements.
        if ((hi - lo) >= 200) {
            int t = (hi - lo) / 8;
            m1 = med3(a[lo + t], a[lo + t * 2], a[lo + t * 3], c);
            m3 = med3(a[midi + t], a[midi + t * 2], a[midi + t * 3], c);
        } else {
            // if it's less than 200 do median of 3
            int t = (hi - lo) / 4;
            m1 = a[lo + t];
            m3 = a[midi + t];
        }

        mid = med3(m1, mid, m3, c);
        if (hi - lo >= 63) {
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The same problems apply to Rubinius, 
even though the core library is mostly 

written in Ruby 
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def isort!(left, right)
  i = left + 1
  while i < right
    j = i
    while j > left
      jp = j - 1
      el1 = at(jp)
      el2 = at(j)
      cmp = (el1 <=> el2)
      break unless cmp > 0
      self[j] = el1
      self[jp] = el2
      j = jp
    end
    i += 1
  end
end




Fixnum* Fixnum::compare(STATE, Fixnum* other) {
  native_int left = to_native();
  native_int right = other->to_native();
  if (left == right) {
    return Fixnum::from(0);
  } else if (left < right) {
    return Fixnum::from(-1);
  } else {
    return Fixnum::from(1);
  }
}
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public static native void arraycopy(Object src, int srcPos,
                                    Object dest, int destPos,
                                    int length);




Interlude: Truffle and Graal 
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Hotspot 
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x + y * z

load_local x
load_local y
load_local z
call :*
call :+

pushq  %rbp
movq   %rsp, %rbp
movq   %rdi, -8(%rbp)
movq   %rsi, -16(%rbp)
movq   %rdx, -24(%rbp)
movq   -16(%rbp), %rax
movl   %eax, %edx
movq   -24(%rbp), %rax
imull  %edx, %eax
movq   -8(%rbp), %rdx
addl   %edx, %eax
popq   %rbp
ret

JIT 
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Hotspot 
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Truffle

JIT
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Hotspot 
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AST Interpreter 
Uninitialized Nodes 


T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon,
and M. Wolczko. One VM to rule them all. In Proceedings of Onward!, 2013.
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AST Interpreter 
Uninitialized Nodes 




Node Rewriting 
for Profiling Feedback 




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@NodeInfo(shortName = "+")
public abstract class SLAddNode extends SLBinaryNode {

    public SLAddNode(SourceSection src) {
        super(src);
    }

    @Specialization(rewriteOn = ArithmeticException.class)
    protected long add(long left, long right) {
        return ExactMath.addExact(left, right);
    }

    @Specialization
    @TruffleBoundary
    protected BigInteger add(BigInteger left, BigInteger right) {
        return left.add(right);
    }

    @Specialization(guards = "isString(left, right)")
    @TruffleBoundary
    protected String add(Object left, Object right) {
        return left.toString() + right.toString();
    }

    protected boolean isString(Object a, Object b) {
        return a instanceof String || b instanceof String;
    }
}
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@NodeInfo(shortName = "eval")
public abstract class SLEvalBuiltin extends SLBuiltinNode {

    @SuppressWarnings("unused")
    @Specialization(guards = {
        "stringsEqual(mimeType, cachedMimeType)",
        "stringsEqual(code, cachedCode)"
    })
    public Object evalCached(VirtualFrame frame,
                    String mimeType, String code,
                    @Cached("mimeType") String cachedMimeType,
                    @Cached("code") String cachedCode,
                    @Cached("create(parse(mimeType, code))") DirectCallNode callNode) {
        return callNode.call(frame, new Object[]{});
    }

    @TruffleBoundary
    @Specialization(contains = "evalCached")
    public Object evalUncached(String mimeType, String code) {
        return parse(mimeType, code).call();
    }

}
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AST Interpreter 
Rewritten Nodes 
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Compilation using 
Partial Evaluation 



Compiled Code 


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codon.com/compilers-for-free 
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Presentation, by Tom Stuart, licensed under a Creative Commons Attribution ShareAlike 3.0
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AST Interpreter 
Uninitialized Nodes 



Compilation using 
Partial Evaluation 



Compiled Code 


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Deoptimization 
to AST Interpreter 
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Node Rewriting to Update 
Profiling Feedback 
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Recompilation using 
Partial Evaluation 
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double 
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x + y * z

load_local x
load_local y
load_local z
call :*
call :+

pushq  %rbp
movq   %rsp, %rbp
movq   %rdi, -8(%rbp)
movq   %rsi, -16(%rbp)
movq   %rdx, -24(%rbp)
movq   -16(%rbp), %rax
movl   %eax, %edx
movq   -24(%rbp), %rax
imull  %edx, %eax
movq   -8(%rbp), %rdx
addl   %edx, %eax
popq   %rbp
ret

Will I be able to use Truffle 
and Graal for real?
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JVMCI

(JVM Compiler Interface)

[Figure: layered diagram with JS, R and Ruby above Truffle, then Graal, then Hotspot; further labels: Java, C++]
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[Figure: Truffle, Graal and Hotspot, with the labels "via Maven etc" and "Java 9"]
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Oracle Labs GraalVM & Truffle/JS Downloads


Thank you for downloading this release of the Oracle Labs GraalVM & Truffle/JS. With this release, 
one can execute Java applications with Graal, as well as JavaScript applications with our Truffle- 
based JavaScript engine. 


Thank you for accepting the OTN License Agreement; you may now download this software. 

Preview for Linux (v0.5)

Preview for Mac OS X (v0.5)


How to install GraalVM 

Unpack the downloaded *.tar.gz file on your machine. You can then use the java and the trufflejs 
executables to execute Java and Javascript programs. Both are in the bin directory of GraalVM. 
Typically, you want to add that directory to your path. 

More detailed getting started instructions are available in the README file in the download. 
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WARNING: This release contains older versions of the JRE and JDK that are provided to help 
developers debug issues in older systems. They are not updated with the latest security patches and 
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How Truffle solves the problem 
of optimising Ruby 
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First problem: JRuby's core library is 

megamorphic 
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[Figure: node specialisation states: Uninitialized, Integer, Double, Generic]



@Specialization(rewriteOn = ArithmeticException.class)
public int add(int a, int b) {
    return ExactMath.addExact(a, b);
}

@Specialization(rewriteOn = ArithmeticException.class)
public long add(long a, long b) {
    return ExactMath.addExact(a, b);
}

@Specialization
public Object addWithOverflow(long a, long b) {
    return fixnumOrBignum(BigInteger.valueOf(a).add(BigInteger.valueOf(b)));
}

@Specialization
public double add(long a, double b) {
    return a + b;
}
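The way a node moves between states (uninitialized, integer, double, generic) can be mimicked in plain Ruby. The sketch below is a toy model of the idea and not Truffle's API: `AddNode`, its `state` field, and the pretend 64-bit overflow check are all assumptions made for illustration. Each state has a cheap guard; when the guard fails, the node picks a more general state and stays there.

```ruby
# Toy model of a self-specialising '+' node. Not Truffle code.
class AddNode
  INT_RANGE = (-2**63)..(2**63 - 1)   # pretend machine integers are 64-bit

  attr_reader :state

  def initialize
    @state = :uninitialized
  end

  def execute(a, b)
    case @state
    when :integer
      return a + b if small_ints?(a, b)
    when :double
      return a + b if a.is_a?(Float) && b.is_a?(Float)
    when :generic
      return a + b
    end
    # First execution, or the guard for the current state failed:
    # choose a new state, then compute the result the slow way.
    respecialise(a, b)
    a + b
  end

  private

  def small_ints?(a, b)
    a.is_a?(Integer) && b.is_a?(Integer) && INT_RANGE.cover?(a + b)
  end

  def respecialise(a, b)
    @state =
      if @state == :uninitialized && small_ints?(a, b)
        :integer
      elsif @state == :uninitialized && a.is_a?(Float) && b.is_a?(Float)
        :double
      else
        :generic   # seen more than one kind of operand: stop specialising
      end
  end
end

node = AddNode.new
node.execute(1, 2)
puts node.state
node.execute(1.5, 2)
puts node.state
```

A call site that only ever sees small integers stays in the `:integer` state, so its fast path never has to consider the other operand types.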
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Second problem: JRuby's core library is

stateless 
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@CoreMethod(names = "send", needsBlock = true, rest = true, required = 1)
public abstract static class SendNode extends CoreMethodArrayArgumentsNode {

    @Child private CallDispatchHeadNode dispatchNode;

    public SendNode(RubyContext context, SourceSection sourceSection) {
        super(context, sourceSection);
        dispatchNode = new CallDispatchHeadNode(context, true,
            MissingBehavior.CALL_METHOD_MISSING);
    }

    @Specialization
    public Object send(VirtualFrame frame, Object self, Object name,
                       Object[] args, DynamicObject block) {
        return dispatchNode.call(frame, self, name, block, args);
    }

}
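What it means for a `send` call site to carry state can be sketched in Ruby itself. The class below is a toy with invented names (`CachedSendSite`, `lookups`), not how the dispatch node above is implemented. It remembers the method found for the last receiver class and name, and repeats the lookup only when one of them changes. It deliberately ignores method redefinition and singleton methods, which a real cache has to handle.

```ruby
# Toy per-call-site cache for send. Not the JRuby+Truffle implementation.
class CachedSendSite
  attr_reader :lookups

  def initialize
    @cached_class = nil
    @cached_name = nil
    @cached_method = nil
    @lookups = 0
  end

  def call(receiver, name, *args)
    unless receiver.class.equal?(@cached_class) && name == @cached_name
      # Slow path: look the method up and remember it for next time.
      @cached_class = receiver.class
      @cached_name = name
      @cached_method = receiver.class.instance_method(name)
      @lookups += 1
    end
    # Fast path: reuse the remembered method.
    @cached_method.bind(receiver).call(*args)
  end
end

site = CachedSendSite.new
1_000.times { |i| site.call(i, :+, 1) }
puts site.lookups          # one lookup serves all the Integer receivers
site.call("a", :+, "b")
puts site.lookups          # a String receiver forces a second lookup
```

A stateless `send`, like the Java method shown earlier, performs the equivalent of the slow path on every call.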
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public static class IntegerArrayBuilderNode extends ArrayBuilderNode {

    private final int expectedLength;

    public IntegerArrayBuilderNode(RubyContext context, int expectedLength) {
        super(context);
        this.expectedLength = expectedLength;
    }

    @Override
    public Object start() {
        return new int[expectedLength];
    }
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Third problem: JRuby's core library is 

very deep 
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Fourth problem: JRuby's core library isn't 

amenable to optimisations 
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@CoreMethod(names = "sort", needsBlock = true)
public abstract class SortNode extends ArrayCoreMethodNode {

    @Child private CallDispatchHeadNode compareDispatchNode;

    @ExplodeLoop
    @Specialization
    public DynamicObject sortVeryShort(VirtualFrame frame, DynamicObject array) {
        final int size = getSize(array);

        // Copy with an exploded loop for PE
        for (int i = 0; i < getContext().getOptions().ARRAY_SMALL; i++) {
            if (i < size) {
                store.set(i, originalStore.get(i));
            }
        }

        // Selection sort - written very carefully to allow PE
        for (int i = 0; i < getContext().getOptions().ARRAY_SMALL; i++) {
            if (i < size) {
                for (int j = i + 1; j < getContext().getOptions().ARRAY_SMALL; j++) {
                    if (j < size) {
                        final Object a = store.get(i);
                        final Object b = store.get(j);
                        if (((int) compareDispatchNode.call(frame, b, "<=>", null, a)) < 0) {
                            store.set(j, a);
                            store.set(i, b);
                        }
                    }
                }
            }
        }

        return createArray(getContext(), store, size);
    }
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@ExplodeLoop

// Selection sort - written very carefully to allow PE
for (int i = 0; i < getContext().getOptions().ARRAY_SMALL; i++) {
    if (i < size) {
        for (int j = i + 1; j < getContext().getOptions().ARRAY_SMALL; j++) {
            if (j < size) {
                final Object a = store.get(i);
                final Object b = store.get(j);
                if (((int) compareDispatchNode.call(frame, b, "<=>", null, a)) < 0) {
                    store.set(j, a);
                    store.set(i, b);
                }
            }
        }
    }
}
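To see why the loop bounds are a constant rather than the array's size, it may help to write out what full unrolling leaves behind. The Ruby sketch below assumes a bound of 3 (`ARRAY_SMALL` here is a made-up constant, where the Java code reads it from an options object) and only handles arrays of at most that many elements. The first method keeps the same shape as the Java loops; the second is the fixed sequence of compare-and-swap steps they reduce to.

```ruby
ARRAY_SMALL = 3   # assumed bound for illustration

# Same shape as the loops above: the trip count is a constant and the
# real size is only checked inside the body.
def sort_very_short(array)
  store = array.dup
  size = store.size
  ARRAY_SMALL.times do |i|
    next unless i < size
    ((i + 1)...ARRAY_SMALL).each do |j|
      next unless j < size
      a = store[i]
      b = store[j]
      if (b <=> a) < 0
        store[j] = a
        store[i] = b
      end
    end
  end
  store
end

# The same work with both loops fully unrolled for ARRAY_SMALL = 3:
# no loop counters remain, only guarded compare-and-swap steps.
def sort_unrolled(array)
  store = array.dup
  size = store.size
  swap = lambda do |i, j|
    store[i], store[j] = store[j], store[i] if (store[j] <=> store[i]) < 0
  end
  swap.call(0, 1) if size > 1
  swap.call(0, 2) if size > 2
  swap.call(1, 2) if size > 2
  store
end

p sort_very_short([8, 2, 5])
p sort_unrolled([8, 2, 5])
```

Once the code is in the second form, each `<=>` is an ordinary call site that can be specialised and inlined on its own.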
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A simple example 
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def min(a, b)
  [a, b].sort[0]
end

puts min(2, 8)


puts [2, 8].sort[0]


t0 = 2 <=> 8
t1 = t0 < 0 ? 2 : 8
t2 = t0 > 0 ? 8 : 2
t3 = [t1, t2]
puts t3[0]


t0 = 2 <=> 8
t1 = t0 < 0 ? 2 : 8
t2 = t0 > 0 ? 8 : 2
puts t1


t0 = -1
t1 = t0 < 0 ? 2 : 8
puts t1


t1 = -1 < 0 ? 2 : 8
puts t1


t1 = true ? 2 : 8
puts t1


t1 = 2
puts t1


puts 2


t0 = a <=> b
t1 = t0 < 0 ? a : b
puts t1


t1 = (a <=> b) < 0 ? a : b
puts t1


puts (a <=> b) < 0 ? a : b
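The end point of this reduction can be checked against the starting point directly. In this small sketch the method names `min_sort` and `min_folded` are made up for the comparison: the first is the original `min`, the second is the expression that remains after the array and the sort have been optimised away.

```ruby
# The method as written on the slide.
def min_sort(a, b)
  [a, b].sort[0]
end

# What is left after inlining, unrolling and removing the array.
def min_folded(a, b)
  (a <=> b) < 0 ? a : b
end

pairs = [[2, 8], [8, 2], [5, 5], [-1, 0]]
pairs.each do |a, b|
  raise "mismatch for #{a}, #{b}" unless min_sort(a, b) == min_folded(a, b)
end

puts min_folded(2, 8)
```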
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A deliberately extreme example 
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module Foo
  def self.foo(a, b, c)
    hash = {a: a, b: b, c: c}
    array = hash.map { |k, v| v }
    x = array[0]
    y = [a, b, c].sort[1]
    x + y
  end
end

class Bar
  def method_missing(method, *args)
    if Foo.respond_to?(method)
      Foo.send(method, *args)
    else
      0
    end
  end
end

bar = Bar.new
loop do
  start = Time.now
  1_000_000.times do
    bar.foo(14, 8, 6)
  end
  puts Time.now - start
end
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bar.foo(14, 8, 6)

22!
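The value 22 can be traced by hand through the body of `Foo.foo` for the arguments 14, 8 and 6. The lines below simply replay that body with the arguments bound to local variables; nothing in them is specific to any Ruby implementation.

```ruby
a, b, c = 14, 8, 6

hash  = {a: a, b: b, c: c}
array = hash.map { |_k, v| v }   # values in insertion order: 14, 8, 6
x = array[0]                     # first value: 14
y = [a, b, c].sort[1]            # middle of 6, 8, 14: 8
puts x + y                       # 14 + 8
```

Because every input is a constant, an optimiser that can see through the hash, the block, the sort, `method_missing` and `send` is free to replace the whole call with that one number.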
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[Figure: speedup relative to baseline implementation (s/s); implementations shown include jruby-1.7.20-indy, rbx-2.5.5 and topaz-dev]
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[Figure: Graal IR graph of the compiled method, with control-flow and data-flow edges marked. Visible nodes: FixedGuard(!=false) TransferToInterpreter (17769, 17772, 17775), C(22), Box (18368), LoadIndexed (17051), Return (17602)]
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movabs 0x11e2037a8, %rax    ; {oop(a 'java/lang/Integer' = 22)}
...
retq
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C extensions 
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C extensions are a hack to work around
performance, but now they stop us
really fixing performance
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A lot of this has been about removing 
barriers to the excellent optimisations 

we already have 
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def clamp(num, min, max)
  [min, num, max].sort[1]
end
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VALUE psd_native_util_clamp(VALUE self, VALUE r_num, VALUE r_min, VALUE r_max) {
  int num = FIX2INT(r_num);
  int min = FIX2INT(r_min);
  int max = FIX2INT(r_max);

  return num > max ? r_max : (num < min ? r_min : r_num);
}
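The C function computes the same thing as the one-line Ruby `clamp`, only with explicit comparisons. A quick way to see this is to write the C logic in Ruby next to the sort-based version, as in the sketch below. The names `clamp_sort` and `clamp_compare` are made up for the comparison, and both assume `min <= max`.

```ruby
# The idiomatic version: sort three values and take the middle one.
def clamp_sort(num, min, max)
  [min, num, max].sort[1]
end

# The comparison logic of the C extension, written in Ruby.
def clamp_compare(num, min, max)
  num > max ? max : (num < min ? min : num)
end

[-20, 0, 128, 255, 300].each do |num|
  unless clamp_sort(num, 0, 255) == clamp_compare(num, 0, 255)
    raise "mismatch for #{num}"
  end
end

puts clamp_sort(300, 0, 255)
puts clamp_sort(-20, 0, 255)
```

If the implementation can reduce the first form to the second by itself, the C version no longer buys anything.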
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def cmyk_to_rgb(c, m, y, k)
  Hash[{
    r: (65535 - (c * (255 - k) + (k << 8))),
    g: (65535 - (m * (255 - k) + (k << 8))),
    b: (65535 - (y * (255 - k) + (k << 8)))
  }.map { |k, v| [k, Util.clamp(v, 0, 255)] }]
end
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cmyk_to_rgb 



psd_native_util_clamp 


FIX2INT 
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C Extension Performance for psd_native and oily_png 



Matthias Grimmer, Chris Seaton, Thomas Wuerthinger, Hanspeter Moessenboeck: 

Dynamically Composing Languages in a Modular Way: Supporting C Extensions for Dynamic Languages 
Modularity '14 Proceedings of the 14th International Conference on Modularity 
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Conclusions 
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The blocker for performance of idiomatic 
Ruby code is the core library, not basic 

language features 
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This extends to everything that forms a 
barrier - including C extensions 
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Specialisation, splitting, inlining, partial 
evaluation, inline caching are all 
solutions to this problem 
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Truffle makes it easy to add these to a 
language implementation 
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Can result in an order of magnitude 
performance increase with reasonable 

effort 